hybrid variance-reduced sgd algorithm
Hybrid Variance-Reduced SGD Algorithms For Minimax Problems with Nonconvex-Linear Function
We develop a novel and single-loop variance-reduced algorithm to solve a class of stochastic nonconvex-convex minimax problems involving a nonconvex-linear objective function, which has various applications in different fields such as machine learning and robust optimization. This problem class has several computational challenges due to its nonsmoothness, nonconvexity, nonlinearity, and non-separability of the objective functions. Our approach relies on a new combination of recent ideas, including smoothing and hybrid biased variance-reduced techniques. Our algorithm and its variants can achieve $\mathcal{O}(T^{-2/3})$-convergence rate and the best-known oracle complexity under standard assumptions, where T is the iteration counter. They have several computational advantages compared to existing methods such as simple to implement and less parameter tuning requirements. They can also work with both single sample or mini-batch on derivative estimators, and with constant or diminishing step-sizes. We demonstrate the benefits of our algorithms over existing methods through two numerical examples, including a nonsmooth and nonconvex-non-strongly concave minimax model.
Review for NeurIPS paper: Hybrid Variance-Reduced SGD Algorithms For Minimax Problems with Nonconvex-Linear Function
Additional Feedback: The main contribution of this paper is a single-loop stochastic algorithm which achieves the best-known complexity bound and outperforms the existing prox-linear algorithms in computational sense. To make this argument more convincing, I hope the authors can address the following concerns: 1. Clearly state why your algorithm outperforms prox-linear algorithms. Indeed, by simply exploring the structure of stochastic problem in Eq.(1), the prox-linear subproblem can be reformulated using conjugate function and becomes the same as your subproblem. Indeed, when b and \psi are in special forms, I agree that we can achieve the closed-form solution as you point out. However, this also holds true for prox-linear algorithms and a few other double-loop algorithms.
Review for NeurIPS paper: Hybrid Variance-Reduced SGD Algorithms For Minimax Problems with Nonconvex-Linear Function
The paper introduces a single-loop stochastic algorithm for solving a special class of nonconvex-concave minimax problems that achieves best-known complexity bound. The rebuttal addressed most of the reviewers' concerns on the algorithmic justification, although some concern remains in terms of the special structure. However, please consider revising the paper to address R1 and R3 's remarks, in particular: - Adjust the title to reflect the special structure instead of overclaim the contribution; - Elaborate the desirable property of single-loop algorithm over existing methods; - Add detailed comparisons to prior work including prox-linear algorithms for compositional problems and recent algorithms for general nonconvex-concave minimax problems.
Hybrid Variance-Reduced SGD Algorithms For Minimax Problems with Nonconvex-Linear Function
We develop a novel and single-loop variance-reduced algorithm to solve a class of stochastic nonconvex-convex minimax problems involving a nonconvex-linear objective function, which has various applications in different fields such as ma- chine learning and robust optimization. This problem class has several compu- tational challenges due to its nonsmoothness, nonconvexity, nonlinearity, and non-separability of the objective functions. Our approach relies on a new combi- nation of recent ideas, including smoothing and hybrid biased variance-reduced techniques. Our algorithm and its variants can achieve \mathcal{O}(T {-2/3}) -convergence rate and the best-known oracle complexity under standard assumptions, where T is the iteration counter. They have several computational advantages compared to exist- ing methods such as simple to implement and less parameter tuning requirements.